Introduction

Information capacity of the discrete-time additive Gaussian channel with
feedback is an open problem. It has long been speculated that causal feedback
can increase capacity. We give here sufficient conditions for optimum causal
linear feedback to increase information capacity for any fixed value of the
constraint, for all values of the constraint, and for all sufficiently large
values of the constraint.

A  special  case  of  these  results  is  for  the  finite-dimensional  channel 
with  a  pure  power  constraint.  The  method  developed  here  gives  the  solution  for 
that  case  in  a  particularly  easy  fashion;  see  [1]. 

Recent work on the capacity of feedback channels has been done by Ihara
[2] (for the finite-dimensional channel) and by Cover and Pombra [3].

Problem  Statement 

The capacity problem will be considered for both the infinite-dimensional
and the finite-dimensional discrete-time additive Gaussian channel. However,
the setup will be given only for $\ell_2$; it will be seen (by substituting
$\mathbb{R}^K$ for $\ell_2$) that the procedure also applies without change to
$\mathbb{R}^K$, although of course the finite-dimensional channel is much
simpler and does not require the full development given here.

All stochastic processes are defined on a probability space
$(\Omega, \beta, \mu)$; $E(\cdot)$ will denote expectation with respect to
$\mu$. $\|x\|$ will denote the norm of the $\ell_2$ vector $x$:
$\|x\|^2 = \sum_{n \ge 1} [x(n)]^2$.

The channel output is $Y = X - BY + N$, where $N$ is additive zero-mean
Gaussian noise with strictly-positive trace-class covariance matrix $R_N$, $X$
is a message process independent of $N$, and $B$ is a Hilbert-Schmidt
strictly-lower-triangular (HSSLT) matrix: $\sum_{i,j \ge 1} b_{ij}^2 < \infty$
and $b_{ij} = 0$ for $j \ge i$, all $i \ge 1$. The mutual information of
interest is that between $X$ and $Y$, denoted $I(X,Y)$. The constraint will be
given in terms of a trace-class covariance matrix $R_W$. Any constraint must
imply a constraint of this form if the capacity is to be finite [4]. The class
of admissible message processes $X$ and HSSLT matrices $B$ consists of all
pairs $(X,B)$ such that almost all sample paths of $X$ belong to
range($R_W^{1/2}$), range($B$) is contained in range($R_W^{1/2}$), and
$E\|X - BY\|_W^2 \le P$, where $\|u\|_W^2 = \|R_W^{-1/2}u\|^2$. The capacity is
then the supremum of the mutual information $I(X,Y)$ over all such admissible
pairs $(X,B)$.

The feedback capacity will be denoted by $C_{WF}(P)$. The capacity of this
channel without feedback is for the case $B = 0$, so that the constraint is
$E\|X\|_W^2 \le P$. This capacity will be denoted by $C_W(P)$.

The assumptions that $R_N$ and $R_W$ are strictly positive can be dropped.
Attention can be restricted to range($R_N$); in order to have finite capacity,
one must then have that $R_W$ is strictly positive as an operator in
range($R_N$); see [4] for details. However, without loss of generality, it is
assumed here that both $R_W$ and $R_N$ are strictly positive.


Preliminaries 

This section contains several mathematical definitions and small results
that will be needed to prove the main result. It will be seen that much of
this is obvious when one treats $\mathbb{R}^K$.

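Let $\ell_2 \otimes \ell_2$ denote the set of all Hilbert-Schmidt operators
mapping $\ell_2$ into $\ell_2$. $A$ is in $\ell_2 \otimes \ell_2$ if and only
if $A$ has a matrix representation such that
$\sum_{i,j \ge 1} [A(ij)]^2 < \infty$. For $A_1$ and $A_2$ in
$\ell_2 \otimes \ell_2$, define

$$\langle A_1, A_2 \rangle_\otimes = \mathrm{Trace}\, A_1 A_2^* = \sum_i \sum_j A_1(ij) A_2^*(ji).$$

$\langle \cdot, \cdot \rangle_\otimes$ defines an inner product on
$\ell_2 \otimes \ell_2$, and it is known that $\ell_2 \otimes \ell_2$ is a
Hilbert space under this inner product [5]. Moreover, convergence of a
sequence $(A_n)$ in $\ell_2 \otimes \ell_2$ to an element $A$ in
$\ell_2 \otimes \ell_2$ implies that $\|A_n - A\| \to 0$ and thus
$A_n x \to Ax$ for all $x$ in $\ell_2$.

$(\delta_n)$ will denote the natural basis vectors in $\ell_2$:
$\delta_n(i) = 0$ for $i \ne n$, $\delta_n(n) = 1$. Let
$H_n = \mathrm{span}\{\delta_i, i \le n\}$ and denote by $P_n$ the projection
operator with range space $H_n$. $P_n$ is a diagonal matrix with
$P_n(i,i) = 1$ for $i \le n$; $P_n(i,i) = 0$ for $i > n$.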


Lemma 1: Let $R_W^n$ be the matrix

$$R_W^n(ij) = R_W(ij) \text{ for } i \le n,\ j \le n; \qquad R_W^n(ij) = 0 \text{ otherwise}.$$

Then:

(1) $R_W^n = V_n V_n^*$ for a lower triangular matrix $V_n$ with
$V_n(ii) = c_i$ for $i \le n$, where $\prod_{i=1}^n c_i^2 = $ determinant
$R_W^n$;

(2) $V_n x = 0$ for all $x$ in $H_n^\perp$;

(3) If $m > n$, $V_n = P_n V_m$;

(4) $(V_n)$ is a Cauchy sequence in $\ell_2 \otimes \ell_2$;

(5) $R_W = VV^*$, where $V$ is a unique lower-triangular Hilbert-Schmidt
matrix such that $V(i,i) = c_i$ for all $i \ge 1$, and $V = \lim V_n$ in the
topology of $\ell_2 \otimes \ell_2$. Moreover, $V_n = P_n V$ for $n \ge 1$.


Proof: Since $H_n$ can be identified with $\mathbb{R}^n$, and $R_W^n$ with a
covariance matrix in $\mathbb{R}^n$, (1) is obvious, as is (2). To see (3), if
$i,j \le n$ and $m > n$, then

$$R_W^n(ij) = R_W^m(ij) = \sum_{k \le \min(i,j)} V_m(ik) V_m(jk) = \sum_{k \le \min(i,j)} [P_n V_m](ik)\,[P_n V_m](jk).$$

Since $P_n V_m$ is lower triangular, $P_n V_m(ii) = V_m(ii) = V_n(ii)$ for all
$i \le n$, and the factorization $R_W^n = V_n V_n^*$ is unique when the
diagonal elements of $V_n$ are fixed [7], it follows that $V_n = P_n V_m$.


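For (4), note that $R_W^n = P_n R_W P_n$. If $m > n$, then

$$\mathrm{Trace}\,(V_n - V_m)(V_n - V_m)^* = \sum_j \|(V_n - V_m)^* \delta_j\|^2 = \sum_j \|V_m^*(P_n - I)\delta_j\|^2 = \sum_{j=n+1}^m \|V_m^* \delta_j\|^2$$

$$= \sum_{j=n+1}^m \langle P_m R_W P_m \delta_j, \delta_j \rangle = \sum_{j=n+1}^m \langle R_W \delta_j, \delta_j \rangle.$$

Since $R_W$ has finite trace, this sum converges to zero as
$m, n \to \infty$, showing that $(V_n)$ is Cauchy in $\ell_2 \otimes \ell_2$.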

To obtain (5), we first recall that convergence in $\ell_2 \otimes \ell_2$
norm implies operator-norm convergence to the same limit [5], so there exists
by (4) a Hilbert-Schmidt operator $V$ such that $V_n \to V$ in both
$\ell_2 \otimes \ell_2$ and operator norm. $V_n V_n^*$ must then converge to
$VV^*$ in the operator norm topology. However,
$V_n V_n^* = R_W^n = P_n R_W P_n$, so $V_n V_n^*$ converges to $R_W$ in
operator norm. Since the set of bounded linear operators on $\ell_2$ is a
Banach space under the operator norm [6], a Cauchy sequence has a unique
limit, and this gives $R_W = VV^*$. $V$ is necessarily Hilbert-Schmidt, since
$R_W$ is trace-class. To see that $V$ must be lower-triangular, note that
$\mathrm{Tr}\,(V_n - V)(V_n - V)^* \to 0$, and
$\mathrm{Tr}\,(V_n - V)(V_n - V)^* = \sum_{i,j} (V_n(ij) - V(ij))^2$. Since
$V_n(ij) = 0$ for $j > i$ and all $n \ge 1$, it follows that $V(ij) = 0$ for
$j > i \ge 1$. The same relations show that $V(ii) = c_i$ for all $i \ge 1$.
Since $P_n VV^* P_n = V_n V_n^*$, we proceed as before to obtain
$V_n = P_n V$.
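Items (1) and (3) of Lemma 1 can be checked numerically in a small finite-dimensional case. The sketch below is illustrative only: the 3 x 3 matrix standing in for $R_W$ and the helper names `cholesky` and `leading_block` are assumptions, not part of the original development. It computes the lower-triangular factor of the full matrix and of its leading 2 x 2 block, then compares them.

```python
import math


def cholesky(a):
    """Return lower-triangular v with v v^T = a (a symmetric positive definite)."""
    n = len(a)
    v = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(v[i][k] * v[j][k] for k in range(j))
            if i == j:
                v[i][j] = math.sqrt(a[i][i] - partial)
            else:
                v[i][j] = (a[i][j] - partial) / v[j][j]
    return v


def leading_block(a, n):
    """Top-left n x n block of a, playing the role of R_W^n restricted to H_n."""
    return [row[:n] for row in a[:n]]


# Hypothetical 3 x 3 constraint covariance (illustrative values only).
R_W = [[4.0, 2.0, 0.4],
       [2.0, 3.0, 0.5],
       [0.4, 0.5, 1.0]]

V3 = cholesky(R_W)                    # V_m with m = 3
V2 = cholesky(leading_block(R_W, 2))  # V_n with n = 2

# Item (3): the first n rows of V_m reproduce V_n.
nested = all(abs(V2[i][j] - V3[i][j]) < 1e-12
             for i in range(2) for j in range(2))

# Item (1): the squared diagonal entries multiply to the determinant of R_W.
diag_product = math.prod(V3[i][i] ** 2 for i in range(3))
print(nested, diag_product)
```

The nesting is what makes the sequence $(V_n)$ consistent as $n$ grows, so that a single limit $V$ can serve every truncation.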


If $u$ and $v$ are two vectors in $\ell_2$, then $u \otimes v$ is defined to
be the element of $\ell_2 \otimes \ell_2$ given by
$(u \otimes v)x = \langle v, x \rangle u$.

In order that the capacity without feedback be finite, it is necessary and
sufficient that $R_N = R_W^{1/2}(I+S)R_W^{1/2}$, where $S$ is a self-adjoint
operator in $\ell_2$ such that $(I+S)^{-1}$ exists and is bounded [4]. The
limit points of the spectrum of $S$ consist of all real numbers $\lambda$ such
that $\lambda$ is either an eigenvalue of $S$ of infinite multiplicity, or the
limit of a sequence of distinct eigenvalues, or a point of the continuous
spectrum (i.e., $(S - \lambda I)^{-1}$ exists and is densely defined but not
bounded). The set of limit points of $S$ is not empty. For discussion of these
and related facts, see [6]. $\theta$ will be used to denote the smallest limit
point of the spectrum of $S$. As in [4], $(\lambda_n)$ will always be used to
denote the sequence of eigenvalues of $S$ that are strictly less than
$\theta$; they are ordered by
$\lambda_1 \le \lambda_2 \le \cdots < \theta$, and repeated in the sequence
according to their multiplicity. Of course, there may not be any eigenvalues
strictly less than $\theta$. If $\{\lambda_n, n \ge 1\}$ is not empty, then
$\{e_n, n \ge 1\}$ will denote orthonormal eigenvectors of $S$ corresponding
to the eigenvalues $(\lambda_n)$: $Se_n = \lambda_n e_n$, $n \ge 1$.

With $R_W = VV^*$, $R_W^{1/2} = VL^*$ for $L$ a unitary operator [7]. Since
$I + S = R_W^{-1/2} R_N R_W^{-1/2}$ (on the range of $R_W^{1/2}$),
$L^*(I+S)L = I + L^*SL = V^{-1} R_N V^{*-1}$. As $L$ is unitary, $L^*SL$ has
the same spectrum as $S$, and so $V^{-1} R_N V^{*-1}$ has the same spectrum as
$I + S$. Thus, $1 + \theta$ is the smallest limit point of the spectrum of
$V^{-1} R_N V^{*-1}$ and $\{1 + \lambda_k, k \ge 1\}$ are the eigenvalues of
$V^{-1} R_N V^{*-1}$ that are strictly less than $1 + \theta$.

Main  Result 

Theorem: Let $V$ be a lower-triangular matrix such that $R_W = VV^*$. Fix
$P > 0$.

(1) For the $K$-dimensional channel, let
$\beta_1 \le \beta_2 \le \cdots \le \beta_K$ be the eigenvalues of $S$, with
$J$ the largest integer $\le K$ such that
$J\beta_J < P + \sum_{i=1}^J \beta_i$. $C_{WF}(P) > C_W(P)$ if the set of
eigenvectors of $V^{-1} R_N V^{*-1}$ corresponding to the sequence of
eigenvalues $(1 + \beta_i, i \le J)$ does not contain $J$ natural basis
vectors.

(2) For the infinite-dimensional channel, $C_{WF}(P) > C_W(P)$ if the
following conditions are satisfied:

(a) $\{\lambda_k, k \ge 1\}$ is not empty;

(b) If there exists a largest integer $J$ such that
$J\lambda_J < P + \sum_{i=1}^J \lambda_i$, then $C_{WF}(P) > C_W(P)$ if the
set of eigenvectors of $V^{-1} R_N V^{*-1}$ corresponding to the sequence of
eigenvalues $(1 + \lambda_k, k \le J)$ does not contain $J$ natural basis
vectors;

or (b') If $J\lambda_J < P + \sum_{i=1}^J \lambda_i$ for all $\lambda_J$,
then $C_{WF}(P) > C_W(P)$ if the subspace spanned by the eigenvectors of
$V^{-1} R_N V^{*-1}$ corresponding to the sequence of eigenvalues
$(1 + \lambda_k, k \ge 1)$ does not contain a set of natural basis vectors
that is complete for that subspace and which are eigenvectors of
$V^{-1} R_N V^{*-1}$.


Corollary:

(1) For the finite-dimensional discrete-time channel, causal linear feedback
can increase information capacity for all $P > 0$ if the subspace spanned by
the eigenvectors of $V^{-1} R_N V^{*-1}$ corresponding to its smallest
eigenvalue does not contain a basis consisting entirely of natural basis
vectors which are eigenvectors of $V^{-1} R_N V^{*-1}$. Causal linear feedback
can increase capacity for all sufficiently large $P$ if and only if
$V^{-1} R_N V^{*-1}$ is not diagonal.

(2) For the infinite-dimensional discrete-time channel, causal linear
feedback can increase capacity for all $P > 0$ if $\{\lambda_k, k \ge 1\}$ is
not empty and the subspace spanned by the eigenvectors of
$V^{-1} R_N V^{*-1}$ corresponding to its smallest eigenvalue does not contain
a basis consisting entirely of natural basis vectors which are eigenvectors of
$V^{-1} R_N V^{*-1}$. $C_{WF}(P) > C_W(P)$ for all sufficiently large $P$ if
$\{\lambda_k, k \ge 1\}$ is not empty and the subspace spanned by the
eigenvectors corresponding to the eigenvalues $\{1 + \lambda_k, k \ge 1\}$ of
$V^{-1} R_N V^{*-1}$ does not contain a basis for the subspace consisting
entirely of natural basis vectors which are eigenvectors of
$V^{-1} R_N V^{*-1}$.

Remark: The sufficient condition in (2) giving $C_{WF}(P) > C_W(P)$ for all
sufficiently large $P$ is equivalent to the following statement:
$\{\lambda_k, k \ge 1\}$ is not empty, and the restriction of
$V^{-1} R_N V^{*-1}$ to the subspace spanned by the eigenvectors of
$V^{-1} R_N V^{*-1}$ corresponding to the eigenvalues
$\{1 + \lambda_k, k \ge 1\}$ is not a diagonal matrix.

The results stated in (1) of the Corollary were proved in [1], where the
development is much streamlined because of the simpler nature of the
finite-dimensional problem. That work used $R_W = I$. For the same
finite-dimensional channel and constraint, Ihara has obtained the result that
capacity is increased for all sufficiently large $P$ if $R_N$ is not diagonal
[2], although his result is stated in a different form; his methods are quite
different from those used here. He also gives as a sufficient condition for
$C_{WF}(P) > C_W(P)$ for all $P > 0$ the condition that (in the terminology
used here) $R_N$ has no natural basis vectors as eigenvectors. The
corresponding sufficient condition given in (1) of the Corollary is much
weaker.

Reformulation of the Problem


In this section, the original linear feedback problem is converted into an
equivalent no-feedback problem. Originally, $Y = X - BY + N$, where the matrix
$B$ is HSSLT. $(I+B)^{-1}$ exists, since $B$ can have no non-zero eigenvalues;
$(I+B)^{-1}$ is bounded, since $B$ is compact (and thus has only zero as a
limit point of the spectrum). Thus, $Y = (I+B)^{-1}X + (I+B)^{-1}N$. Since
$(I+B)^{-1}$ is 1:1,

$$I(X, X+N) = I(X, (I+B)^{-1}(X+N)) = I(X, Y).$$

Of course, the constraint $E\|X - BY\|_W^2 \le P$ is the same as
$E\|X - B(I+B)^{-1}(X+N)\|_W^2 \le P$.
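The inversion step above can be illustrated in finite dimensions, where a strictly lower-triangular $B$ is nilpotent and $(I+B)^{-1}$ is a finite sum. The following is a minimal sketch under that assumption; the matrix `B`, the vectors `x` and `noise`, and all helper names are hypothetical values chosen for illustration. It solves the causal recursion $Y = X - BY + N$ coordinate by coordinate and compares the result with $(I+B)^{-1}(X+N)$.

```python
def mat_vec(a, x):
    """Matrix-vector product for list-of-lists matrices."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def mat_mul(a, b):
    """Product of two square list-of-lists matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inverse_of_identity_plus(b):
    """(I + B)^{-1} as the finite sum I - B + B^2 - ..., valid because a
    strictly lower-triangular n x n matrix satisfies B^n = 0."""
    n = len(b)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    neg_b = [[-entry for entry in row] for row in b]
    total, term = identity, identity
    for _ in range(n - 1):
        term = mat_mul(term, neg_b)
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total


def channel_output_recursive(b, x, noise):
    """Solve Y = X - BY + N one coordinate at a time (causal feedback)."""
    y = []
    for i in range(len(x)):
        fed_back = sum(b[i][j] * y[j] for j in range(i))
        y.append(x[i] - fed_back + noise[i])
    return y


# Hypothetical strictly lower-triangular feedback matrix and sample values.
B = [[0.0, 0.0, 0.0],
     [0.5, 0.0, 0.0],
     [-0.3, 0.8, 0.0]]
x = [1.0, -2.0, 0.5]
noise = [0.1, 0.2, -0.4]

y_recursive = channel_output_recursive(B, x, noise)
y_closed = mat_vec(inverse_of_identity_plus(B),
                   [xi + ni for xi, ni in zip(x, noise)])
print(y_recursive, y_closed)
```

Both routes give the same output vector, which is why the feedback channel can be studied through $X + N$ alone.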


Using $R_W = VV^*$ with $V$ Hilbert-Schmidt and lower triangular, the
constraint can be written

$$P \ge E\|V^{-1}X - V^{-1}B(I+B)^{-1}(X+N)\|^2$$

$$P \ge E\|Z - D(VZ+N)\|^2,$$

where $D = V^{-1}B(I+B)^{-1}$ and $Z = V^{-1}X$. $D$ is well-defined and
bounded, since range($B$) $\subset$ range($V$) and $(I+B)^{-1}$ is bounded.
Moreover, since $B$ is HSSLT and both $V^{-1}$ and $(I+B)^{-1}$ are lower
triangular, $D$ must be strictly lower triangular and bounded (BSLT).
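The structure of $D = V^{-1}B(I+B)^{-1}$ can be checked on a small example. This sketch assumes a 3-dimensional channel with hypothetical matrices `V` and `B`; the helper `lower_inverse` is an illustrative forward-substitution routine, not something taken from the source. It confirms that $D$ is strictly lower triangular and that $V^{-1}(X - BY) = Z - D(X+N)$ with $Z = V^{-1}X$.

```python
def mat_vec(a, x):
    """Matrix-vector product for list-of-lists matrices."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def mat_mul(a, b):
    """Product of two square list-of-lists matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def lower_inverse(t):
    """Inverse of an invertible lower-triangular matrix by forward substitution."""
    n = len(t)
    inv = [[0.0] * n for _ in range(n)]
    for col in range(n):
        for row in range(col, n):
            rhs = 1.0 if row == col else 0.0
            acc = sum(t[row][k] * inv[k][col] for k in range(col, row))
            inv[row][col] = (rhs - acc) / t[row][row]
    return inv


# Hypothetical lower-triangular factor V of R_W and HSSLT feedback matrix B.
V = [[2.0, 0.0, 0.0],
     [1.0, 1.5, 0.0],
     [0.2, 0.3, 1.0]]
B = [[0.0, 0.0, 0.0],
     [0.5, 0.0, 0.0],
     [-0.3, 0.8, 0.0]]
n = len(V)

I_plus_B = [[B[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
            for i in range(n)]
V_inv = lower_inverse(V)
D = mat_mul(mat_mul(V_inv, B), lower_inverse(I_plus_B))

# D should vanish on and above the diagonal.
strictly_lower = all(abs(D[i][j]) < 1e-12 for i in range(n) for j in range(i, n))

# Compare V^{-1}(X - BY) with Z - D(X + N) for sample values of X and N.
x = [1.0, -2.0, 0.5]
noise = [0.1, 0.2, -0.4]
x_plus_n = [xi + ni for xi, ni in zip(x, noise)]
y = mat_vec(lower_inverse(I_plus_B), x_plus_n)
by = mat_vec(B, y)
lhs = mat_vec(V_inv, [xi - bi for xi, bi in zip(x, by)])
z = mat_vec(V_inv, x)
rhs = [zi - di for zi, di in zip(z, mat_vec(D, x_plus_n))]
print(strictly_lower, lhs, rhs)
```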


The feedback capacity problem under our initial assumptions thus becomes

maximize $I(X, X+N)$

subject to $P \ge E\|V^{-1}X - D(X+N)\|^2$,

where $D$ is permitted to be any bounded SLT matrix in $\ell_2$.

This is actually the problem that will be considered below in obtaining the
sufficient conditions of the Theorem and the Corollary.

Let $H(\ell_2, \mu)$ be the set of all real random vectors $u$ on
$(\Omega, \beta)$ such that $u(\omega) \in \ell_2$ a.e. $d\mu(\omega)$ and
$E\sum_{n \ge 1} [u(n,\omega)]^2 < \infty$. $H(\ell_2, \mu)$ is a Hilbert
space under the inner product
$(f,g)_\mu = E\sum_{n \ge 1} f(n,\omega) g(n,\omega)$. Let $Y_1$ and $Y_2$ be
two mutually independent zero-mean Gaussian random vectors in
$H(\ell_2, \mu)$: $E\,Y_1(m,\omega) Y_2(n,\omega) = 0$ for all $n, m \ge 1$.
Suppose that $R_{Y_1} + R_{Y_2}$ is strictly positive. Define
$H_-(Y_1+Y_2)$ as the set of all elements $f$ in $H(\ell_2, \mu)$ such that
$f = B(Y_1+Y_2)$ for some bounded SLT matrix operator $B$. $H_-(Y_1+Y_2)$ is
clearly a linear manifold in $H(\ell_2, \mu)$. To see that this linear
manifold is closed, one notes that if $(B^n)$ is a sequence of bounded SLT
operators,

$$\|B^n(Y_1+Y_2) - B^m(Y_1+Y_2)\|_\mu^2 = \|(B^n - B^m)(Y_1+Y_2)\|_\mu^2$$

$$= \mathrm{Trace}\,(B^n - B^m)(R_{Y_1} + R_{Y_2})(B^n - B^m)^*$$

$$\ge \|(B^n - B^m)(R_{Y_1} + R_{Y_2})^{1/2}\|^2 \ge \|B^n - B^m\|^2 \gamma_0,$$

where $\gamma_0$ is the smallest eigenvalue of $R_{Y_1} + R_{Y_2}$. Thus,
$(B^n(Y_1+Y_2))$ Cauchy in $H(\ell_2, \mu)$ implies that $(B^n)$ is Cauchy in
operator norm, so converges to a bounded linear operator $B$. To see that $B$
is SLT, one notes that $R_{Y_1} + R_{Y_2} = QQ^*$ for some lower-triangular
$Q$, and so $(R_{Y_1} + R_{Y_2})^{1/2} = QT$ for $T$ unitary [8]. This gives

$$\|B^n(Y_1+Y_2) - B^m(Y_1+Y_2)\|_\mu^2 = \|(B^n - B^m)(R_{Y_1} + R_{Y_2})^{1/2}\|_\otimes^2 = \|(B^n - B^m)Q\|_\otimes^2.$$

Thus, $(B^n Q)$ is Cauchy in $\ell_2 \otimes \ell_2$, so that $BQ$ must be
strictly lower triangular; since $Q^{-1}$ exists and is lower triangular, this
shows that $B$ is a bounded SLT matrix operator, so that $H_-(Y_1+Y_2)$ is
closed.

Now consider our feedback problem: We wish to maximize $I(X, X+N)$ subject
to $P \ge E\|V^{-1}X - D(X+N)\|^2 = \|V^{-1}X - D(X+N)\|_\mu^2$, where $D$ is
permitted to be any bounded SLT matrix. Given any choice of $D$ that satisfies
this constraint, we know that
$\|V^{-1}X - D(X+N)\|_\mu^2 \ge \|V^{-1}X - P_-(V^{-1}X)\|_\mu^2$, where
$P_-(V^{-1}X)$ is the projection of $V^{-1}X$ onto $H_-(X+N)$. Thus, we can
assume WLOG that $D$ is the optimum bounded SLT matrix for minimizing the
distance in $H(\ell_2, \mu)$ norm between $V^{-1}X$ and $H_-(X+N)$: $D(X+N)$
is the projection of $V^{-1}X$ onto $H_-(X+N)$.

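Now, let $X$ be the optimum no-feedback message for the case when capacity
is attained (assuming here that $C_W(P)$ can be attained). As the message for
the feedback problem, use $\alpha X$. Then
$I(\alpha X, \alpha X + N) > C_W(P)$ if $\alpha > 1$. Choose $\alpha$ to
satisfy the constraint:

$$P = \|V^{-1}\alpha X - D(\alpha X + N)\|_\mu^2 = \alpha^2\, \mathrm{Tr}\, V^{-1} R_X V^{*-1} - A,$$

where $A = A(\alpha)$ is the squared $H(\ell_2, \mu)$ norm of the projection
of $\alpha V^{-1}X$ on $H_-(\alpha X + N)$. Since
$\mathrm{Tr}\, V^{-1} R_X V^{*-1} = \mathrm{Tr}\, R_W^{-1/2} R_X R_W^{-1/2} = E\|X\|_W^2 = P$,
we have $P = \alpha^2 P - A$, so that $\alpha > 1$ if $\alpha V^{-1}X$ is not
orthogonal to $H_-(\alpha X + N)$.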

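Prop.: $\alpha V^{-1}X$ is orthogonal to $H_-(\alpha X + N)$ if and only if
$V^{-1} R_X V^{*-1}$ is diagonal.

Proof: Since $X$ is independent of $N$,
$(\alpha V^{-1}X, D(\alpha X + N))_\mu = \alpha^2\, \mathrm{Tr}\, D R_X V^{*-1} = \alpha^2\, \mathrm{Tr}\, DV[V^{-1} R_X V^{*-1}]$.
If $V^{-1} R_X V^{*-1}$ is diagonal, then (as $DV$ is SLT)
$\mathrm{Tr}\, DV[V^{-1} R_X V^{*-1}] = 0$ for every bounded SLT matrix $D$.

If $\mathrm{Tr}\, DV[V^{-1} R_X V^{*-1}] = 0$ for every bounded SLT matrix
$D$, take $i, j$ with $i > j$. Let $D(k\ell) = 0$ unless $k = i$, $\ell = j$;
$D(ij) = 1$. Then
$\mathrm{Tr}\, DV(V^{-1} R_X V^{*-1}) = [V(V^{-1} R_X V^{*-1})](ji) = 0$. This
shows that $V(V^{-1} R_X V^{*-1})$ must be lower triangular, so that
$V^{-1} R_X V^{*-1}$ must also be lower triangular. Since
$V^{-1} R_X V^{*-1}$ is symmetric, $V^{-1} R_X V^{*-1}$ must be diagonal.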
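The two trace facts used in this proof are easy to confirm numerically. The sketch below uses hypothetical 3 x 3 matrices (`V`, `D`, `G_diag`, `G_coupled` are illustrative stand-ins, with `G` playing the role of $V^{-1} R_X V^{*-1}$): a strictly lower-triangular $D$ gives zero trace against a diagonal $G$, while an elementary $D$ with a single entry $D(ij) = 1$ picks out the $(j,i)$ entry of $VG$.

```python
def mat_mul(a, b):
    """Product of two square list-of-lists matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))


# Hypothetical lower-triangular V and strictly lower-triangular D.
V = [[2.0, 0.0, 0.0],
     [1.0, 1.5, 0.0],
     [0.2, 0.3, 1.0]]
D = [[0.0, 0.0, 0.0],
     [0.9, 0.0, 0.0],
     [-0.4, 2.0, 0.0]]

# Case 1: G diagonal, so D V G is strictly lower triangular with zero trace.
G_diag = [[0.7, 0.0, 0.0],
          [0.0, 1.1, 0.0],
          [0.0, 0.0, 0.4]]
trace_diagonal_case = trace(mat_mul(mat_mul(D, V), G_diag))

# Case 2: G symmetric but not diagonal. The elementary matrix with a single 1
# in row 2, column 1 (1-based, as in the text) extracts the (1, 2) entry of V G.
G_coupled = [[0.7, 0.2, 0.0],
             [0.2, 1.1, 0.0],
             [0.0, 0.0, 0.4]]
E_21 = [[0.0, 0.0, 0.0],
        [1.0, 0.0, 0.0],
        [0.0, 0.0, 0.0]]
VG = mat_mul(V, G_coupled)
trace_elementary = trace(mat_mul(E_21, VG))
picked_entry = VG[0][1]
print(trace_diagonal_case, trace_elementary, picked_entry)
```

A nonzero value in the second case exhibits a bounded SLT matrix that is not orthogonal to the message, which is the situation in which feedback can help.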

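In the above development, we have implicitly assumed that $\alpha$ always
exists to solve the equation $\alpha^2 P = P + A(\alpha)$. This is not
obvious, as the subspace $H_-(\alpha X + N)$ changes with $\alpha$. Here we
will show that a lower bound exists; i.e., there exist $\alpha_1 > 1$ and a
bounded SLT $D$ such that
$\|\alpha_1 V^{-1}X - D(\alpha_1 X + N)\|_\mu^2 = P$.

Let $D$ be the optimum SLT matrix to minimize
$\|V^{-1}X - D(X+N)\|_\mu^2$. Then
$\mathrm{Tr}\, D R_X V^{*-1} = \mathrm{Tr}\, D R_X D^* + \mathrm{Tr}\, D R_N D^* = A(1)$.
Take $\alpha \ne 0$. Then

$$\|\alpha V^{-1}X - D(\alpha X + N)\|_\mu^2 = \alpha^2 (P - 2\,\mathrm{Tr}\, D R_X V^{*-1} + \mathrm{Tr}\, D R_X D^*) + \mathrm{Tr}\, D R_N D^*.$$

If $P - 2\,\mathrm{Tr}\, D R_X V^{*-1} + \mathrm{Tr}\, D R_X D^* \ne 0$, then
one can set $P = \|\alpha V^{-1}X - D(\alpha X + N)\|_\mu^2$ and solve for
$\alpha^2$, obtaining

$$\alpha^2 = \frac{P - \mathrm{Tr}\, D R_N D^*}{P - \mathrm{Tr}\, D R_N D^* - A(1)},$$

giving $\alpha > 1$.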
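A two-dimensional worked case makes the formula for $\alpha^2$ concrete. The sketch below assumes $R_W = I$ (so $V = I$, as in the special case of [1]) and a 2 x 2 channel, where a strictly lower-triangular $D$ has the single free entry $D(2,1)$; the function name `feedback_scale_2d` and the sample covariances are hypothetical. The optimum entry is obtained by projecting $X_2$ onto $X_1 + N_1$.

```python
def feedback_scale_2d(r_x, r_n):
    """For a 2-dimensional channel with V = I, return (P, d, A1, alpha_sq, check).

    d is the optimum strictly lower-triangular coefficient D(2,1), A1 is A(1),
    alpha_sq is the scale from the displayed formula, and check re-evaluates
    the constraint norm at that scale (it should equal P).
    """
    p = r_x[0][0] + r_x[1][1]                 # P = Tr R_X when V = I
    var_obs = r_x[0][0] + r_n[0][0]           # variance of X_1 + N_1
    d = r_x[1][0] / var_obs                   # projection coefficient
    a1 = d * d * var_obs                      # A(1) = Tr D (R_X + R_N) D*
    tr_dnd = d * d * r_n[0][0]                # Tr D R_N D*
    tr_dxd = d * d * r_x[0][0]                # Tr D R_X D*
    tr_dxv = d * r_x[0][1]                    # Tr D R_X V*^{-1} with V = I
    alpha_sq = (p - tr_dnd) / (p - tr_dnd - a1)
    check = alpha_sq * (p - 2.0 * tr_dxv + tr_dxd) + tr_dnd
    return p, d, a1, alpha_sq, check


# Hypothetical covariances: a correlated message and a noise covariance.
R_X = [[1.0, 0.6],
       [0.6, 1.0]]
R_N = [[1.0, 0.3],
       [0.3, 2.0]]
P, d, A1, alpha_sq, check = feedback_scale_2d(R_X, R_N)
print(P, d, A1, alpha_sq, check)

# With a diagonal message covariance the projection vanishes and alpha_sq is 1.
_, d0, A1_0, alpha_sq_0, _ = feedback_scale_2d([[1.0, 0.0], [0.0, 1.0]], R_N)
print(d0, A1_0, alpha_sq_0)
```

In the correlated case the scale exceeds one while the constraint is still met with equality; in the diagonal case no scaling is gained, in line with the Proposition.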

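To see that $P - 2\,\mathrm{Tr}\, D R_X V^{*-1} + \mathrm{Tr}\, D R_X D^* \ne 0$
when $V^{-1} R_X V^{*-1}$ is not diagonal, we note that if inequality does not
hold, then $\|V^{-1}X - D(X+N)\|_\mu^2 = P - A(1) = \mathrm{Tr}\, D R_N D^*$.
Similarly, for any $\alpha \ne 0$,
$\|\alpha V^{-1}X - D(\alpha X + N)\|_\mu^2 = \alpha^2 (P - 2\,\mathrm{Tr}\, D R_X V^{*-1} + \mathrm{Tr}\, D R_X D^*) + \mathrm{Tr}\, D R_N D^* = \mathrm{Tr}\, D R_N D^*$.
Thus, $P - A(\alpha) \le \mathrm{Tr}\, D R_N D^* = P - A(1)$, or
$A(\alpha) \ge A(1)$, all $\alpha \ne 0$. This cannot hold if
$V^{-1} R_X V^{*-1}$ is not diagonal, since $A(\alpha) \le \alpha^2 P$, and
$A(1) \ne 0$ when $V^{-1} R_X V^{*-1}$ is not diagonal.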

We have now shown that causal linear feedback can increase capacity
provided that $V^{-1} R_X V^{*-1}$ is not diagonal, where $R_X$ is the optimum
message covariance matrix in the no-feedback problem (whenever capacity can be
attained in the no-feedback case). These conditions need to be converted into
conditions on the noise covariance matrix $R_N$ and the constraint matrix
$R_W$. This will be done in the next two sections, treating the
$\mathbb{R}^K$ and $\ell_2$ channels.


Finite-Dimensional  Channel 

From Theorem 1 of [4], the optimum no-feedback message has covariance
matrix given by

$$R_X = \frac{1}{J} \sum_{n=1}^J \Big[ \sum_{i=1}^J \beta_i + P - J\beta_n \Big] R_W^{1/2} u_n \otimes R_W^{1/2} u_n,$$

where $\{u_n, n \le K\}$ are o.n. eigenvectors of $S$ corresponding to the
increasing sequence of eigenvalues $(\beta_n)$, and $J$ is the largest integer
$\le K$ such that $P + \sum_{i=1}^J \beta_i > J\beta_J$. Let $L$ be the
unitary operator in $\mathbb{R}^K$ such that $R_W^{1/2} = VL^*$. Then

$$V^{-1} R_X V^{*-1} = \frac{1}{J} \sum_{n=1}^J \Big[ \sum_{i=1}^J \beta_i + P - J\beta_n \Big] (L^* u_n) \otimes (L^* u_n).$$

Now, $V^{-1} R_N V^{*-1} = L^*(I+S)L$, and $Su_n = \beta_n u_n$, so

$$L^*(I+S)L\,L^* u_n = L^*(I+S)u_n = (1 + \beta_n) L^* u_n;$$

i.e., $\{L^* u_n, n \le K\}$ are c.o.n. eigenvectors of $V^{-1} R_N V^{*-1}$
corresponding to the sequence of eigenvalues $(1 + \beta_n)$, $n \le K$.
$V^{-1} R_X V^{*-1}$ is then diagonal if and only if
$\{L^* u_n, n \le J\}$ can be taken as natural basis vectors, proving (1) of
the Theorem. For all sufficiently small $P > 0$,

$$V^{-1} R_X V^{*-1} = \frac{P}{M} \sum_{n=1}^M L^* u_n \otimes L^* u_n,$$

where $M$ is the multiplicity of the eigenvalue $\beta_1$ of $S$, and of the
eigenvalue $1 + \beta_1$ of $V^{-1} R_N V^{*-1}$. Thus, $V^{-1} R_X V^{*-1}$
cannot be diagonal if $\{L^* u_n, n \le M\}$ cannot be taken as natural basis
vectors. For larger values of $P$, when $R_X$ has the representation given
above for $J > M$, then the eigenvectors of $V^{-1} R_X V^{*-1}$ must include
$\{L^* u_n, n \le M\}$. Now, if $V^{-1} R_X V^{*-1}$ is diagonal, then it must
have a c.o.n. set of eigenvectors consisting of natural basis vectors.
However, span$\{L^* u_n, n \le M\}$ cannot be spanned by $M$ natural basis
vectors, so that $V^{-1} R_X V^{*-1}$ cannot have a c.o.n. set of eigenvectors
consisting entirely of natural basis vectors. This shows that
$C_{WF}(P) > C_W(P)$ for every $P > 0$ if the $M$-dimensional eigenmanifold of
$V^{-1} R_N V^{*-1}$ is not spanned by $M$ natural basis vectors.
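The choice of $J$ and the bracketed weights in the expression for $R_X$ above can be sketched numerically. The function name `waterfill` and the sample eigenvalues are hypothetical; the rule implemented is the one stated in the text, namely $J$ is the largest integer $\le K$ with $P + \sum_{i=1}^J \beta_i > J\beta_J$, and the weight on the $n$-th eigenvector is $(\sum_{i=1}^J \beta_i + P)/J - \beta_n$.

```python
def waterfill(betas, p):
    """Return (J, weights) for eigenvalues betas of S and constraint value p.

    weights[n] is the coefficient of the n-th eigenvector (n < J) in the
    optimum no-feedback message covariance, in the V^{-1}-coordinates.
    """
    betas = sorted(betas)
    j_best = 0
    for j in range(1, len(betas) + 1):
        # Keep the largest J satisfying P + sum_{i<=J} beta_i > J * beta_J.
        if p + sum(betas[:j]) > j * betas[j - 1]:
            j_best = j
    level = (sum(betas[:j_best]) + p) / j_best
    weights = [level - b for b in betas[:j_best]]
    return j_best, weights


# Hypothetical eigenvalues of S for a 3-dimensional channel.
betas = [0.2, 0.5, 2.0]

J_small, w_small = waterfill(betas, 0.1)   # small P: only the smallest eigenvalue
J_mid, w_mid = waterfill(betas, 1.0)       # moderate P: two directions used
J_large, w_large = waterfill(betas, 10.0)  # large P: J = K
print(J_small, w_small)
print(J_mid, w_mid)
print(J_large, w_large)
```

In each case the weights are positive and sum to $P$, and for small $P$ the single weight equals $P/M$ with $M = 1$, matching the small-$P$ expression above.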

By letting $P$ become sufficiently large, $J = K$, and then the above
expressions show that $V^{-1} R_X V^{*-1}$ will be diagonal if and only if
$V^{-1} R_N V^{*-1}$ is diagonal:

$$V^{-1} R_X V^{*-1} = \Big[ \frac{\sum_{i=1}^K \beta_i + P + K}{K} \Big] I - V^{-1} R_N V^{*-1}.$$

This proves the sufficient conditions of the Corollary.

To see that capacity cannot be increased by causal feedback if
$V^{-1} R_N V^{*-1}$ is diagonal, one notes that the feedback capacity problem
is that of maximizing $I(X, X+N)$ subject to the constraint
$E\|V^{-1} D(X,N)\|^2 \le P$, where $\|x\|^2 = \sum_{i=1}^K x_i^2$ and $D$ is
a possibly non-linear operator depending only on the past of the second
coordinate (causal): $[D(x,y)]_n = D^n(x, [y_1, y_2, \ldots, y_{n-1}])$, where
$D^n$ is real-valued. Write the constraint as
$E\|V^{-1} F(T,Z)\|^2 \le P$, where $T = V^{-1}X$, $Z = V^{-1}N$, and
$F(x,y) = D[Vx, Vy]$. Since $V$ is lower-triangular and $D$ is causal in the
second coordinate, $F$ is also causal in the second coordinate.

$$I(X,Y) = I(V^{-1}X, V^{-1}Y) = I[V^{-1}X, V^{-1}D(X,N) + V^{-1}N] = I[T, V^{-1}F(T,Z) + Z].$$

The constraint is $E\|V^{-1} F(T,Z)\|^2 \le P$, and $V^{-1} F(x,y)$ is a
causal function of $y$. However, $Z$ has covariance matrix
$V^{-1} R_N V^{*-1}$. Thus, if $V^{-1} R_N V^{*-1}$ is diagonal, the original
problem is equivalent to the capacity problem with causal feedback when the
channel is without memory. It is well-known that capacity cannot be increased
in this case.

This completes the proof of the theorem and corollary for the
finite-dimensional channel.


Infinite-Dimensional Channel ($\ell_2$)

First, assume that $\{\lambda_n, n \ge 1\}$ is not empty. Several cases need
to be considered. The various expressions for the optimum $R_X$ (when it
exists) and the value of $C_W(P)$ are taken from [4].

(1) $\sum_{n \ge 1} (\theta - \lambda_n) < \infty$.

If $P < \sum_{n \ge 1} (\theta - \lambda_n)$, then there exists finite $J$
such that the optimum no-feedback covariance is given by [4, Theorem 3],

$$R_X = \frac{1}{J} \sum_{n=1}^J \Big[ \sum_{i=1}^J \lambda_i + P - J\lambda_n \Big] R_W^{1/2} e_n \otimes R_W^{1/2} e_n. \qquad (*)$$

As in the finite-dimensional channel, this shows that $V^{-1} R_X V^{*-1}$
will not be diagonal if the subspace spanned by the eigenvectors of
$V^{-1} R_N V^{*-1}$ corresponding to the sequence of eigenvalues
$(1 + \lambda_k, k \le J)$ of $V^{-1} R_N V^{*-1}$ does not contain a basis
consisting entirely of natural basis vectors which are eigenvectors of
$V^{-1} R_N V^{*-1}$.

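If $P = \sum_{n=1}^\infty (\theta - \lambda_n)$, then
$J\lambda_J \le P + \sum_{i=1}^J \lambda_i$ for every $\lambda_J$ [4], and an
optimum message covariance exists and is given by

$$R_X = \sum_{n \ge 1} (\theta - \lambda_n) R_W^{1/2} e_n \otimes R_W^{1/2} e_n.$$

This gives

$$V^{-1} R_X V^{*-1} = \sum_{n \ge 1} (\theta - \lambda_n) L^* e_n \otimes L^* e_n,$$

which is clearly diagonal if and only if $\{L^* e_n, n \ge 1\}$ can be taken
as natural basis vectors.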

If $(\lambda_n)$ is an infinite sequence and
$P > \sum_n (\theta - \lambda_n)$, then capacity cannot be attained in the
no-feedback case. However, the capacity is given by
$\lim_{K \to \infty} I_K(P)$, where $I_K(P)$ is the value of
$I(X^K, X^K + N)$ when

$$R_X^K = \frac{1}{K} \sum_{n=1}^K \Big[ \sum_{i=1}^K \lambda_i + P - K\lambda_n \Big] R_W^{1/2} e_n \otimes R_W^{1/2} e_n.$$

Let $A_K(1)$ be the squared norm of the projection of $V^{-1}X^K$ onto
$H_-(X^K + N)$. If $\limsup_K A_K(1) > 0$, then as before the capacity can be
increased. That is, we choose $K$ sufficiently large so that
$I(\alpha X^K, \alpha X^K + N) > C_W(P)$, where $\alpha > 1$,
$\alpha \ge \alpha_K(1)$, with $\alpha_K(1)$ satisfying

$$\alpha_K^2 = \frac{P - \mathrm{Tr}\, D_K R_N D_K^*}{P - \mathrm{Tr}\, D_K R_N D_K^* - A_K(1)},$$

with $D_K$ the bounded SLT matrix that minimizes
$E\|V^{-1}X^K - D_K(X^K + N)\|^2$. The problem is now reduced to showing that
$A_K(1) \to 0$ cannot hold if there exists some $J$ such that
$\{L^* e_n, n \le J\}$ cannot be taken to be natural basis vectors. Suppose
such $J$ exists and take $K > J$.

Write $X^K = X_{0K}^J + X_{1K}^J$, where $X_{0K}^J$ is the zero-mean Gaussian
process with covariance matrix

$$R_{0K}^J = \frac{1}{K} \sum_{n=1}^J \Big[ \sum_{i=1}^K \lambda_i + P - K\lambda_n \Big] R_W^{1/2} e_n \otimes R_W^{1/2} e_n,$$

and $X_{1K}^J$ is independent of $X_{0K}^J$. As $K \to \infty$, $R_{0K}^J$
converges in the operator norm topology to
$\sum_{n=1}^J (\theta - \lambda_n) R_W^{1/2} e_n \otimes R_W^{1/2} e_n$, using
the fact that $\sum_{n \ge 1} (\theta - \lambda_n) < P$. Now suppose that
$A_K(1) \to 0$. This requires that
$E\langle V^{-1}(X_{0K}^J + X_{1K}^J), B(X_{0K}^J + X_{1K}^J + N) \rangle \to 0$
for every fixed bounded SLT matrix $B$. Since $X_{0K}^J$ and $X_{1K}^J$ are
independent, this implies that
$E\langle V^{-1}X_{0K}^J, BX_{0K}^J \rangle \to 0$ for every bounded SLT $B$,
or $\{\mathrm{Trace}\, B R_{0K}^J V^{*-1}\} \to 0$. Since
$R_{0K}^J \to \sum_{n=1}^J (\theta - \lambda_n) R_W^{1/2} e_n \otimes R_W^{1/2} e_n$,
this implies (as in the proof of the Proposition) that
$\sum_{n=1}^J (\theta - \lambda_n) L^* e_n \otimes L^* e_n$ is diagonal. This
cannot be, since by assumption $\{L^* e_n, n \le J\}$ cannot be taken as
natural basis vectors. This shows that optimum feedback will increase capacity
when $P > \sum_{n \ge 1} (\theta - \lambda_n)$ and $(\lambda_n)$ is an
infinite sequence.
n>l  ^  ” 

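Finally, suppose that $(\lambda_n)$ is a finite sequence,
$\lambda_1 \le \lambda_2 \le \cdots \le \lambda_K$, and
$P > \sum_{n=1}^K (\theta - \lambda_n)$. Then there exists an infinite o.n.
set $\{u_n, n \ge 1\}$ such that $\|(S - \theta I)u_n\| \to 0$ and
$u_n \perp \mathrm{span}\{e_1, \ldots, e_K\}$ for all $n \ge 1$ [6].

Fix $M < \infty$ and take $\epsilon > 0$ such that
$\theta - \lambda_K > \epsilon$.

Let $X_1^M$ be the zero-mean Gaussian process with covariance matrix $R_1^M$
given by

$$R_1^M = \frac{1}{K} \sum_{n=1}^K \Big[ \sum_{i=1}^K \lambda_i + P_1^M - K\lambda_n \Big] R_W^{1/2} e_n \otimes R_W^{1/2} e_n,$$

where $P_1^M < P$.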

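Choose o.n. vectors $u_1^{M,\epsilon}, \ldots, u_M^{M,\epsilon}$ from the set
$\{u_n, n \ge 1\}$ such that
$|\langle (S - \theta I)u_i^{M,\epsilon}, u_i^{M,\epsilon} \rangle| \le \epsilon$
for $i \le M$. Let $X_2^{M,\epsilon}$ be the zero-mean Gaussian process with
covariance matrix given by

$$R_2^{M,\epsilon} = \frac{P - P_1^M}{M(1 + \theta + \epsilon)} \sum_{i=1}^M R_W^{1/2} u_i^{M,\epsilon} \otimes R_W^{1/2} u_i^{M,\epsilon}.$$

Now, let $X^{M,\epsilon}$ be the zero-mean Gaussian process with covariance
matrix $R^{M,\epsilon}$, where $R^{M,\epsilon} = R_1^M + R_2^{M,\epsilon}$.
Since $u_i^{M,\epsilon}$ is orthogonal to span$\{e_1, \ldots, e_K\}$ for
$i \le M$,

$$I(X^{M,\epsilon}, X^{M,\epsilon} + N) = I(X_1^M, X_1^M + N) + I(X_2^{M,\epsilon}, X_2^{M,\epsilon} + N)$$

$$= \frac{1}{2} \sum_{n=1}^K \log\Big[ 1 + \frac{\sum_{i=1}^K \lambda_i + P_1^M - K\lambda_n}{K(1 + \lambda_n)} \Big] + \frac{1}{2} M \log\Big[ 1 + \frac{P - P_1^M}{M(1 + \theta + \epsilon)} \Big].$$

$X^{M,\epsilon}$ satisfies the constraint for any $P_1^M < P$, since

$$E\|X^{M,\epsilon}\|_W^2 = \mathrm{Trace}\, R_W^{-1/2} R^{M,\epsilon} R_W^{-1/2} = \mathrm{Trace}\, R_W^{-1/2}(R_1^M + R_2^{M,\epsilon}) R_W^{-1/2} \le P.$$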


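Now, define $P_1^M$ by

$$P_1^M = M^{-1} \Big[ KP - (M-K) \sum_{i=1}^K \lambda_i + (M-K)K\theta \Big].$$

Then

$$I(X_1^M, X_1^M + N) = \frac{1}{2} \sum_{n=1}^K \log\Big[ \frac{MK + K\sum_{i=1}^K \lambda_i + KP + (M-K)K\theta}{MK(1 + \lambda_n)} \Big].$$

As $M \to \infty$,

$$I(X_1^M, X_1^M + N) \to \frac{1}{2} \sum_{n=1}^K \log\Big[ \frac{1 + \theta}{1 + \lambda_n} \Big].$$

Similarly,

$$I(X_2^{M,\epsilon}, X_2^{M,\epsilon} + N) = \frac{1}{2} M \log\Big[ \frac{M^2(1 + \theta + \epsilon) + (M-K)[P + \sum_{i=1}^K (\lambda_i - \theta)]}{M^2(1 + \theta + \epsilon)} \Big].$$

As $M \to \infty$,

$$I(X_2^{M,\epsilon}, X_2^{M,\epsilon} + N) \to \frac{P + \sum_{i=1}^K (\lambda_i - \theta)}{2(1 + \theta + \epsilon)}.$$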

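Thus, $I(X^{M,\epsilon}, X^{M,\epsilon} + N)$ converges, as $M \to \infty$, to

$$\frac{1}{2} \sum_{n=1}^K \log\Big[ \frac{1 + \theta}{1 + \lambda_n} \Big] + \frac{P + \sum_{i=1}^K (\lambda_i - \theta)}{2(1 + \theta + \epsilon)}.$$

From Theorem 3 of [4], the capacity $C_W(P)$ is equal to

$$\frac{1}{2} \sum_{n=1}^K \log\Big[ \frac{1 + \theta}{1 + \lambda_n} \Big] + \frac{P + \sum_{i=1}^K (\lambda_i - \theta)}{2(1 + \theta)},$$

so that by choosing $\epsilon$ sufficiently small and $M$ sufficiently large,
one can obtain $I(X^{M,\epsilon}, X^{M,\epsilon} + N)$ arbitrarily near
$C_W(P)$.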

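As $\epsilon \to 0$ and $M \to \infty$,
$V^{-1} R_2^{M,\epsilon} V^{*-1}$ converges to a diagonal matrix, since
$|\langle (S - \theta I)u_i^{M,\epsilon}, u_i^{M,\epsilon} \rangle| \le \epsilon$
for $i \le M$.

However,

$$V^{-1} R_1^M V^{*-1} = \frac{1}{K} \sum_{n=1}^K \Big[ \sum_{i=1}^K \lambda_i + P_1^M - K\lambda_n \Big] L^* e_n \otimes L^* e_n.$$

This matrix will not be diagonal if $\{L^* e_n, n \le K\}$ cannot be taken to
consist entirely of natural basis vectors. This is equivalent to not having
$K$ natural basis vectors as eigenvectors of $V^{-1} R_N V^{*-1}$
corresponding to the sequence of eigenvalues $(1 + \lambda_k, k \le K)$.
Inserting the above definition of $P_1^M$,

$$V^{-1} R_1^M V^{*-1} = \sum_{n=1}^K \Big[ (\theta - \lambda_n) + \Big( P + \sum_{j=1}^K \lambda_j - K\theta \Big) M^{-1} \Big] L^* e_n \otimes L^* e_n.$$

This matrix is independent of $\epsilon$; as $M \to \infty$ it converges in
the operator norm topology to
$V^{-1} R_1 V^{*-1} = \sum_{n=1}^K (\theta - \lambda_n) L^* e_n \otimes L^* e_n$.
Similar to the preceding part of the proof, we now consider
$A_{M,\epsilon}(1)$, the squared $H(\ell_2, \mu)$ norm of the projection of
$V^{-1}X^{M,\epsilon}$ onto $H_-(X^{M,\epsilon} + N)$.
$A_{M,\epsilon}(1) \to 0$ as $M \to \infty$ implies, as in the preceding part
of the proof, that $V^{-1} R_1 V^{*-1}$ is diagonal. This is a contradiction.
In fact, $A_{M,\epsilon}(1)$ is bounded away from zero. Define

$$\alpha_{M,\epsilon}^2 = \frac{P - \mathrm{Tr}\, D_{M,\epsilon} R_N D_{M,\epsilon}^*}{P - \mathrm{Tr}\, D_{M,\epsilon} R_N D_{M,\epsilon}^* - A_{M,\epsilon}(1)},$$

with $D_{M,\epsilon}(X^{M,\epsilon} + N)$ the projection of
$V^{-1}X^{M,\epsilon}$ onto $H_-(X^{M,\epsilon} + N)$. Since
$I(X^{M,\epsilon}, X^{M,\epsilon} + N) \to C_W(P)$ as $\epsilon \to 0$,
$M \to \infty$, and $A_{M,\epsilon}(1)$ is bounded away from zero, we obtain
$I(\alpha_{M,\epsilon} X^{M,\epsilon}, \alpha_{M,\epsilon} X^{M,\epsilon} + N) > C_W(P)$.
This completes the proof of sufficiency in part (2) of the Theorem when
$\sum_n (\theta - \lambda_n) < \infty$.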

(2) $\sum_n (\theta - \lambda_n) = \infty$.

In this case, $P < \sum_n (\theta - \lambda_n)$ for all $P > 0$, capacity is
attained in the no-feedback case for every $P > 0$, and for each $P > 0$ there
exists $J < \infty$ (the value of $J$ depending on $P$) such that the optimum
message covariance matrix $R_X$ is given as in (*). As in case (1), it is
clear that feedback can increase capacity if the set of eigenvectors
corresponding to the sequence of eigenvalues $(1 + \lambda_k, k \le J)$ of
$V^{-1} R_N V^{*-1}$ does not contain $J$ natural basis vectors.

This completes the proof of (2) of the Theorem. The proof of (2) of the
Corollary follows from (2) of the Theorem, in the same way that (1) of the
Corollary was obtained. □


Verification  of  the  Sufficient  Conditions 


Verification of the sufficient conditions given in the Theorem is equivalent
to determining the value of $C_W(P)$, as can be seen from the expressions for
$C_W(P)$ [4]. The difficulty of verifying the sufficient conditions of the
Corollary is considerably less than for the Theorem. We now summarize how one
can verify that $C_{WF}(P) > C_W(P)$ for all $P > 0$. This will be done by
giving conditions that are equivalent to the conditions given in (1) and (2)
of the Corollary for $C_{WF}(P) > C_W(P)$ for all $P > 0$.

Suppose that $V^{-1} R_N V^{*-1}$ is nondiagonal. Write

$V^{-1} R_N V^{*-1} = A - D$

where $D$ is a diagonal matrix whose non-zero elements $D(i,i)$ are the
diagonal elements $\pi_i$ of $V^{-1} R_N V^{*-1}$ such that
$(V^{-1} R_N V^{*-1})(i,j) = (V^{-1} R_N V^{*-1})(j,i) = 0$ for all $j \ne i$.
Then $C_{WF}(P) > C_W(P)$ for all $P > 0$ if the following conditions are
satisfied.

1. Finite-Dimensional Channel.

$\inf_{\|x\|=1} \langle Ax, x \rangle \le \inf \{D(i,i) : D(i,i) > 0\}$;
2. Infinite-Dimensional Channel.

(a) $\inf_{\|x\|=1} \langle Ax, x \rangle < \inf \{D(i,i) : D(i,i) > 0\}$;

(b) $\inf_{\|x\|=1} \langle Ax, x \rangle$ is an eigenvalue of
$V^{-1} R_N V^{*-1}$ of finite multiplicity; and

(c) if $H_0$ is the subspace spanned by the eigenvectors of
$V^{-1} R_N V^{*-1}$ corresponding to the eigenvalue
$\inf_{\|x\|=1} \langle Ax, x \rangle$, then

$\inf_{\|x\|=1,\ x \perp H_0} \langle Ax, x \rangle > \inf_{\|x\|=1} \langle Ax, x \rangle$.
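The splitting that these conditions rely on can be made concrete for a small
matrix. The following is a minimal sketch, assuming `M` stands for an
already-computed symmetric matrix $V^{-1} R_N V^{*-1}$ and reading the
decomposition as $M = A - D$, that is, $A = M + D$. The function name
`split_isolated_diagonal`, the tolerance `tol`, and the 3x3 example are
illustrative choices and do not come from the text.

```python
def split_isolated_diagonal(m, tol=1e-12):
    """Return (A, D) with M = A - D.

    D keeps the diagonal entries M[i][i] whose row and column are otherwise
    zero, so that the natural basis vector e_i is an eigenvector of M.
    A is then M + D.
    """
    n = len(m)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        isolated = all(
            abs(m[i][j]) <= tol and abs(m[j][i]) <= tol
            for j in range(n) if j != i
        )
        if isolated:
            d[i][i] = m[i][i]
    a = [[m[i][j] + d[i][j] for j in range(n)] for i in range(n)]
    return a, d


# Hypothetical example: index 2 is decoupled from indices 0 and 1.
M = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 0.0],
     [0.0, 0.0, 3.0]]
A, D = split_isolated_diagonal(M)
```

For this example the only non-zero element of `D` is `D[2][2] = 3`, while the
smallest eigenvalue of `A` is 1 (the block `[[2, 1], [1, 2]]` has eigenvalues
1 and 3, and the remaining diagonal entry of `A` is 6), so the inequality in
condition 1 is satisfied.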


To see that (2) implies the corresponding sufficient condition in (2) of the
Corollary, one can verify that (2a) and (2b) imply that the smallest
eigenvalue of $V^{-1} R_N V^{*-1}$ exists and does not have an eigenspace
containing a set of natural basis vectors complete for the subspace; (2b)
shows that the dimension of this subspace is finite; and (2b) plus (2c) show
that this eigenvalue is not the limit of a sequence of distinct eigenvalues.


These conditions are not complex. Consider the finite-dimensional channel.
First, one inspects the matrix $V^{-1} R_N V^{*-1}$ and locates the diagonal
elements such that the $i$th row and $i$th column are all zero except for the
$ii$ element. Denote these elements as $\pi_i$. This is the set of eigenvalues
of $V^{-1} R_N V^{*-1}$ corresponding to natural basis vectors as
eigenvectors. If the smallest such $\pi_i$ is strictly greater than
$\inf_{\|x\|=1} \langle V^{-1} R_N V^{*-1} x, x \rangle$, then the smallest
eigenvalue of $V^{-1} R_N V^{*-1}$ has no eigenvectors that are natural basis
vectors, and so $C_{WF}(P) > C_W(P)$ for all $P > 0$. If the smallest $\pi_i$
is equal to $\inf_{\|x\|=1} \langle V^{-1} R_N V^{*-1} x, x \rangle$, then one
must determine the multiplicity of $\inf_i \pi_i$ as an eigenvalue of
$V^{-1} R_N V^{*-1}$. If this multiplicity is strictly greater than the number
of times $\inf_i \pi_i$ appears among the $\{\pi_i,\ i \ge 1\}$, then again
$C_{WF}(P) > C_W(P)$ for all $P > 0$.
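The finite-dimensional procedure just described can be sketched in code. This
is only an illustration under stated assumptions: `m` is taken to be the
symmetric matrix $V^{-1} R_N V^{*-1}$, already computed; the names
`jacobi_eigenvalues` and `feedback_increases_capacity` and the tolerance `tol`
are hypothetical; and the cyclic Jacobi rotation routine is a stand-in for
whatever symmetric eigenvalue solver is available, not a method taken from the
text.

```python
import math


def jacobi_eigenvalues(m, sweeps=100, tol=1e-22):
    """Eigenvalues (ascending) of a real symmetric matrix, by Jacobi rotations."""
    a = [row[:] for row in m]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off <= tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Angle chosen so that the rotated (p, q) entry vanishes.
                if a[p][p] == a[q][q]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # columns p and q of A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rows p and q of J^T (A J)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


def feedback_increases_capacity(m, tol=1e-9):
    """True if the sufficient condition described above is verified for m.

    False only means the condition was not verified.
    """
    n = len(m)
    # Diagonal entries whose row and column are otherwise zero.
    isolated = [
        m[i][i] for i in range(n)
        if all(abs(m[i][j]) <= tol and abs(m[j][i]) <= tol
               for j in range(n) if j != i)
    ]
    eigenvalues = jacobi_eigenvalues(m)
    smallest = eigenvalues[0]  # equals inf <Mx, x> over unit vectors x
    # With no isolated entries, no natural basis vector is an eigenvector.
    if not isolated or min(isolated) > smallest + tol:
        return True
    multiplicity = sum(1 for lam in eigenvalues if abs(lam - smallest) <= tol)
    occurrences = sum(1 for p in isolated if abs(p - smallest) <= tol)
    return multiplicity > occurrences


# Hypothetical matrices differing only in the decoupled diagonal entry.
case_larger = [[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 3.0]]
case_equal = [[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 1.0]]
case_smaller = [[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 0.5]]
results = [feedback_increases_capacity(c)
           for c in (case_larger, case_equal, case_smaller)]
```

The three cases exercise the three branches of the procedure: the decoupled
entry is larger than the smallest eigenvalue (condition verified), equal to it
with the eigenvalue having multiplicity two (condition verified), and itself
the simple smallest eigenvalue (condition not verified).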

Necessary  Conditions 

The  Corollary  shows  that  the  sufficient  condition  for  feedback  to 
increase  capacity  for  all  sufficiently  large  P  is  also  necessary,  in  the  case 
of  the  finite-dimensional  channel.  Although  the  emphasis  here  has  been  on 
sufficient  conditions,  it  is  our  conjecture  that  each  of  the  four  sufficient 
conditions  given  in  the  Corollary  is  also  a  necessary  condition  for  the  same 
result.


Concluding  Remarks 

It can be seen that the capacity problem with feedback for small P reduces to
consideration of the eigenmanifold for the smallest eigenvalue of
$V^{-1} R_N V^{*-1}$, for the finite-dimensional channel. If this eigenvalue
has multiplicity one, then feedback can increase capacity for every value of P
if the corresponding eigenvector is not a natural basis vector.

In the case of the infinite-dimensional channel, the same situation
holds,  except  that  the  additional  requirement  is  imposed  of  having  the 
smallest  eigenvalue  be  strictly  less  than  the  smallest  limit  point  of 

For the case of sufficiently large $P$, the problem can be couched in terms
of the reproducing kernel Hilbert space of $R_W$, say $H_W$. If the Gaussian
cylinder set measure $\mu$ on $H_W$ defined by $\mu_N = \mu \circ j^{-1}$,
$j$ the natural injection of $H_W$ into $\ell_2$ (i.e., $jx$ is just $x$
viewed as an element of $\ell_2$ rather than as an element of $H_W$), has
diagonal covariance operator, then $C_{WF}(P) = C_W(P)$; otherwise,
$C_{WF}(P) > C_W(P)$ for all sufficiently large $P$. In essence, this states
that capacity can be increased by feedback for all sufficiently large $P$ if
the noise is correlated when it is viewed as belonging to $H_W$ rather than to
$\ell_2$.

The setup given here is rather general, and an obvious extension is to apply
the same approach to the time-continuous channel. However, the structure of
(Hilbert-Schmidt) Volterra operators is more complicated in $L_2[0,T]$ than in
$\ell_2$, and an arbitrary covariance operator in $L_2[0,T]$ may not have a
causal decomposition of the form $R = VV^*$, $V$ Volterra. Thus, a complete
extension of these results in the form stated here does not seem possible.
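By contrast, for a finite-dimensional symmetric positive-definite matrix a
lower-triangular factor always exists: the Cholesky factor. The sketch below
illustrates this and then forms $V^{-1} R_N V^{*-1}$ by forward substitution.
It assumes real matrices, so that $V^*$ is the transpose, and assumes that $V$
is the causal factor of a covariance matrix called `r_w` here. The names
`cholesky_lower`, `forward_solve`, `whiten`, `r_w`, `r_n` and the 2x2 values
are hypothetical and are not taken from the text.

```python
import math


def cholesky_lower(r):
    """Lower-triangular V with R = V V^T, for symmetric positive-definite R."""
    n = len(r)
    v = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = r[i][j] - sum(v[i][k] * v[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                v[i][i] = math.sqrt(s)
            else:
                v[i][j] = s / v[j][j]
    return v


def forward_solve(v, b):
    """Solve V y = b for lower-triangular V."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(v[i][k] * y[k] for k in range(i))) / v[i][i])
    return y


def whiten(v, r_n):
    """Return V^{-1} R_N V^{-T} without forming V^{-1} explicitly."""
    n = len(v)
    # Column j of H = V^{-1} R_N solves V h = (column j of R_N).
    cols = [forward_solve(v, [r_n[i][j] for i in range(n)]) for j in range(n)]
    h = [[cols[j][i] for j in range(n)] for i in range(n)]
    # Row i of H V^{-T} solves V y = (row i of H).
    return [forward_solve(v, h[i]) for i in range(n)]


# Hypothetical 2x2 example: correlated constraint covariance, white noise.
r_w = [[4.0, 2.0], [2.0, 5.0]]
r_n = [[1.0, 0.0], [0.0, 1.0]]
V = cholesky_lower(r_w)
M = whiten(V, r_n)
```

In this toy case `V` is `[[2, 0], [1, 2]]` and `M` has non-zero off-diagonal
entries, so it is nondiagonal even though `r_n` itself is diagonal.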